Speech spectrum restoration based on conditional restricted Boltzmann machine
Abstract
Many speech enhancement algorithms have been proposed for restoring speech from distorted speech. However, if some components of the signal are completely missing or distorted, those algorithms cannot restore the clean speech. Considering that the restricted Boltzmann machine (RBM) is a stochastic version of the Hopfield network, which can be used as an associative memory, we propose to use its “recall” ability for speech spectrum restoration when some parts of the speech spectrum are completely missing or distorted. Traditionally, when training the RBM, speech spectral patches are randomly selected as input, with no consideration of the temporal correlation between different input spectral patches. In this study, we further propose to model this temporal correlation by using a conditional RBM (CRBM). Inference on the CRBM is almost the same as on the RBM; the only change is that the biases are replaced by conditional dynamic biases. We conducted experiments on clean speech reconstruction and distorted speech restoration based on the trained models. Our experimental results showed that both the RBM and the CRBM worked well in the restoration task. By incorporating temporal correlation in the CRBM, a further improvement in reconstruction and restoration accuracy was achieved.
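The idea of inference with conditional dynamic biases, combined with "recall" of missing spectral components, can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: it assumes Gaussian visible units with unit variance, binary hidden units, deterministic mean-field updates instead of sampling, and a flattened vector of past frames as the conditioning input. All names (`dynamic_biases`, `restore_frame`, `W`, `A`, `B`, `a`, `b`, `mask`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dynamic_biases(history, a, b, A, B):
    """Shift the static RBM biases by a linear function of past frames.

    history: (n_hist,) flattened past spectral frames (assumed layout)
    a, b:    static visible (n_vis,) and hidden (n_hid,) biases
    A, B:    history-to-visible (n_vis, n_hist) and
             history-to-hidden (n_hid, n_hist) weights
    """
    return a + A @ history, b + B @ history

def restore_frame(v_obs, mask, history, W, a, b, A, B, n_iters=20):
    """Fill in missing spectral bins of one frame by mean-field recall.

    v_obs: (n_vis,) observed frame; values where mask is False are ignored
    mask:  (n_vis,) boolean, True where the bin is reliable
    W:     (n_vis, n_hid) visible-to-hidden weights
    """
    a_dyn, b_dyn = dynamic_biases(history, a, b, A, B)
    # Start the missing bins from the dynamic visible bias.
    v = np.where(mask, v_obs, a_dyn)
    for _ in range(n_iters):
        h = sigmoid(W.T @ v + b_dyn)        # hidden activation probabilities
        v_recon = W @ h + a_dyn             # mean of the Gaussian visibles
        v = np.where(mask, v_obs, v_recon)  # keep reliable bins clamped
    return v
```

Setting `A` and `B` to zero reduces the sketch to plain RBM recall with static biases, which reflects the abstract's statement that the two inference procedures differ only in the biases.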
Similar resources
Enhanced Factored Three-Way Restricted Boltzmann Machines for Speech Detection
In this letter, we propose enhanced factored three-way restricted Boltzmann machines (EFTW-RBMs) for speech detection. The proposed model incorporates conditional feature learning by introducing a multiplicative input branch, which allows a modulation over visible-hidden node pairs. Instead of directly feeding previous frames of speech spectrum into this third unit, a specific algorithm, includ...
A Hybrid Algorithm based on Deep Learning and Restricted Boltzmann Machine for Car Semantic Segmentation from Unmanned Aerial Vehicles (UAVs)-based Thermal Infrared Images
Nowadays, ground vehicle monitoring (GVM) is one of the application areas of intelligent traffic control systems that use image processing methods. In this context, unmanned aerial vehicle based thermal infrared (UAV-TIR) images are one of the optimal options for GVM due to their suitable spatial resolution, cost-effectiveness, and low image volume. The methods that have been prop...
Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine
Straightforward application of Deep Belief Nets (DBNs) to acoustic modeling produces a rich distributed representation of speech data that is useful for recognition and yields impressive results on the speaker-independent TIMIT phone recognition task. However, the first-layer Gaussian-Bernoulli Restricted Boltzmann Machine (GRBM) has an important limitation, shared with mixtures of diagonalcova...
Complex-Valued Restricted Boltzmann Machine for Direct Speech Parameterization from Complex Spectra
This paper describes a novel energy-based probabilistic distribution that represents complex-valued data and explains how to apply it to direct feature extraction from complex-valued spectra. The proposed model, the complex-valued restricted Boltzmann machine (CRBM), is designed to deal with complex-valued visible units as an extension of the well-known restricted Boltzmann machine (RBM). Like t...
Generative Acoustic-Phonemic-Speaker Model Based on Three-Way Restricted Boltzmann Machine
In this paper, we discuss a way of modeling speech signals based on a three-way restricted Boltzmann machine (3WRBM) for automatically separating phonetic-related information and speaker-related information from an observed signal. The proposed model is an energy-based probabilistic model that includes three-way potentials of three variables: acoustic features, latent phonetic features, and speak...
Publication date: 2013